Abstract

Fibre properties and the biochemical composition of cell walls are important traits in many applications. For example, 
the lengths of fibres define the strength and quality of paper, and lignin content is a critical parameter for the use 
of biomass in biofuel production. Identifying genes controlling these traits is comparatively difficult in woody species,
because of long generation times and limited amenability to high-resolution genetic mapping. To address this
problem, this study mapped quantitative trait loci (QTLs) defining fibre length and lignin content in the Arabidopsis 
recombinant inbred line population Col-4xLer-0. Adapting high-throughput phenotyping techniques for both traits for 
measurements in Arabidopsis inflorescence stems identified significant QTLs for fibre length on chromosomes 2 and 
5, as well as one significant QTL affecting lignin content on chromosome 2. For fibre length, total variation within the 
population was 208% higher than between parental lines and the identified QTLs explained 50.58% of the observed 
variation. For lignin content, the values were 261 and 26.51%, respectively. Bioinformatics analysis of the associated 
intervals identified a number of candidate genes for fibre length and lignin content. This study demonstrates that 
molecular mapping of QTLs pertaining to wood and fibre properties is possible in Arabidopsis, which substantially 
broadens the use of Arabidopsis as a model species for the functional characterization of plant genes. 
Introduction

Wood properties are of great significance to the multitude of
nations that produce pulp and paper products. Variation in
pulp properties, notably fibre morphology and physical properties
such as fibre length, width, strength, and coarseness,
has a profound influence on the properties of paper products
(Dinwoodie, 1965; Page and Seth, 1980). In woody plants
used for pulp production, the traditional method of improving
fibre quality has been through tree breeding (Via et al., 2004).
Recent work has shown that woody plant fibre cell dimensions
and strength are under genetic control (Yu et al., 2001). Lignin,
a major chemical component of wood, also plays an important
role in defining the properties of pulp-based products.
The lignin content of pulp fibres has to be carefully controlled 
in order to balance paper strength and optical qualities, and 
removal of lignin in the pulping process is both costly and environmentally
detrimental (Odendahl, 1994; Biermann, 1996).
Moreover, lignin content is also a critical parameter in the generation
of biofuels from cellulosic materials (Mansfield et al.,
2012). Cellulosic biomass, for example from woody species,
has become increasingly important as a source of bioenergy
that is not in competition with food production (Somerville
et al., 2010).

Abbreviations: BAC, bacterial artificial chromosome; BLUP, best linear unbiased predictor; CIM, composite interval mapping; FQA, fibre quality analyser; QTL,

Genetic engineering has demonstrated that direct gene manipulation
can improve biomass production or wood properties in
woody species (Eriksson et al., 2000; Pilate et al., 2002; Ranocha
et al., 2002; Bjurhager et al., 2010; Li et al., 2011; Zawaski
et al., 2011), but due to strong resistance in the marketplace to
the use of genetically modified organisms, it would be preferable
to obtain similar results through traditional breeding strategies.

Long generation times, variable growth conditions, and comparatively
scarce genomic resources have limited the use of
high-resolution genetic mapping in tree breeding, although each
of these limitations is addressed in emerging tree genomic systems,
such as Populus trichocarpa (Tuskan et al., 2006; Jansson
and Douglas, 2007). A genetic model plant such as Arabidopsis
thaliana, on the other hand, is well suited for genetic studies
involving large numbers of individuals over multiple generations,
because of its small size and short life cycle. Furthermore,
a large number of Arabidopsis recombinant inbred line (RIL)
populations are readily available from stock centres and some
of them have been thoroughly characterized genetically (Reiter
et al., 1992; Lister and Dean, 1993). Arabidopsis RIL populations
have been used for fast and accurate identification of numerous
quantitative trait loci (QTLs) controlling traits such as circadian
rhythm (Swarup et al., 1999), the effects of various nutrients and
minerals (Bentsink et al., 2003; Harada et al., 2004; Harada and
Leigh, 2006; Reymond et al., 2006; Waters and Grusak, 2008;
Zeng et al., 2008; Ghandilyan et al., 2009), resistance to predators
(Pfalz et al., 2007), stress responses (McKay et al., 2008),
biomass accumulation (Lisec et al., 2008), heterosis (Lisec et al.,
2009), and salicylic acid pathway responses (Alcazar et al., 2009).

The initial identification of QTLs in Arabidopsis has enabled
subsequent detailed genetic studies to unravel the molecular
mechanisms controlling the traits associated with the QTLs, and
these insights have often contributed to a better understanding
of angiosperm plant biology in general (El-Assal et al., 2001;
Kroymann et al., 2003; Mouchel et al., 2004; Masle et al., 2005;
Werner et al., 2005; Bentsink et al., 2006; El-Din Zhang et al.,
2006).

Here, we have explored the feasibility of detecting QTLs in 
an Arabidopsis RIL population for the traits of fibre length and 
lignin content in inflorescence stems. We have described the use 
of high-throughput phenotyping protocols for both traits and the 
molecular mapping of significant QTLs controlling fibre proper- 
ties and lignin content in any plant species. 

Materials and methods 

Plant material 

A set of 98 RILs derived from a cross between Columbia (Col-4) and 
Landsberg erecta (Ler-0) (Lister and Dean, 1993) was obtained from the 
ABRC (Stock number CS1899). Seeds were surface sterilized, plated on
0.5× MS agar medium (2.1 g l⁻¹ MS salts, 10 g l⁻¹ sucrose, 8 g l⁻¹ agar,
pH 5.8) and vernalized for 48 h at 4 °C. Seedlings were transferred to
Professional Pro-Mix 'BX'/Mycorise Pro (Premier Horticulture, Riviere
du Loup, Canada) supplemented with 0.3% (v/v) Nutricote 14-14-14
Type 100 fertilizer (Chisso-Asahi Fertilizer Co., Tokyo, Japan) after
10 d of continuous light. Plants were grown under a 16 h/8 h day/night
cycle at 21 °C day and 18 °C night temperature. Light was maintained at
200 µmol s⁻¹ m⁻². Post-flowering inflorescence stems were collected for
fibre quality analyser (FQA) processing or lignin content measurement. 
The set of lines was grown five times, in five separate experiments, and 
the resulting stems were analysed separately. Three sets of plants were 
used for fibre length measurements and two sets of plants were used for 
lignin content measurements. A single plant per line was used in the first 
FQA experiment and two plants per line were bulked in all subsequent 
experiments. 

Isolation of fibres from plant stems 

Samples (10 mg) of air-dried stem material taken from the bottom 5 cm 
of one stem or each of two stems, node and internode, were placed in a 
20 ml test tube for the pulping reaction. The stem sample was compacted 
using a glass rod before adding 2 ml each of distilled water and acetic 
acid (glacial) and heating in a boiling water bath for 2 min. After add- 
ing 2 ml of 30% hydrogen peroxide, the samples were returned to the 
boiling water bath for 90 min. The resulting solution was then carefully 
decanted to retain the cooked stem tissue within the tube, and the tissue 
was rinsed gently with distilled water three times to remove residual 
reagents. The delignified stem tissue was transferred to a screw-cap 
conical plastic centrifuge tube with 35 ml of distilled water and then agi- 
tated vigorously to disintegrate fibre bundles and form a homogenous 
fibre suspension. The suspension was subsequently filtered through a 
Britt Dynamic Drainage Jar using 3 l of water, stirred using an overhead
stirrer at 200 r.p.m. (TAPPI, 1992) to collect fibres retained on
the 200-mesh screen (105 µm opening). The retained fibres were rinsed
off the filter mesh, collected in a 50 ml centrifuge tube and diluted to a
total volume of 50 ml with distilled water. The fibre suspensions were 
inspected visually and fibre bundles, if present, were removed manually 
using a fine needle. Triplicate measurements were performed on 10 mg 
samples taken from individual stems of the same line. 

Fibre quality analyser measurements 

Fibre length was determined using a FQA (OpTest; Hawkesbury, ON, 
Canada) with a cytometric flow cell and image analysis system capable of 
rapidly and accurately measuring fibre curl, kink, and length distributions 
(Olson et al., 1995; Roberston et al., 1999). Five millilitres of the fibre
suspension was dispensed into a 600 ml plastic beaker for FQA analysis.
This sample was then diluted automatically to exactly 600 ml by the FQA.
The fibre input was adjusted to a targeted events s⁻¹ measurement range of
25–40 fibres s⁻¹. The length-weighted fibre length is reported.
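As an illustration of the reported quantity, a length-weighted mean is conventionally the sum of squared fibre lengths divided by the sum of fibre lengths, so that longer fibres contribute more than they do to the arithmetic mean. The following minimal Python sketch assumes a plain list of individual fibre lengths in mm; the function name and the sample values are hypothetical and do not reproduce the FQA software or any measured data.

```python
def length_weighted_mean(lengths_mm):
    """Return the length-weighted mean fibre length: sum(L^2) / sum(L)."""
    total = sum(lengths_mm)
    if total <= 0:
        raise ValueError("need at least one fibre with positive length")
    return sum(length * length for length in lengths_mm) / total


# Hypothetical fibre lengths (mm), chosen only for illustration.
sample = [0.5, 0.5, 1.0, 2.0]
arithmetic = sum(sample) / len(sample)
weighted = length_weighted_mean(sample)
print(f"arithmetic mean = {arithmetic:.3f} mm")
print(f"length-weighted mean = {weighted:.3f} mm")
```

Because short fragments are down-weighted, this statistic is less sensitive to fines that survive filtration than the arithmetic mean would be.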

Removal of extractives 

The inflorescence stems of Arabidopsis accessions were first ground
using a microball mill to pass through an 80-mesh screen. Prior to extraction,
the ground samples were dried in a vacuum oven at 40 °C for
48 h and conditioned in a vacuum desiccator over phosphorus pentoxide
overnight. The ground stems were then extracted using a rapid
extraction by washing method. The procedure was based on Morrison's
quick extraction method (Morrison, 1972) with a reduction in sample
particle size and weight, and passage through a fine filtration membrane.
Approximately 0.1 g of dried 80-mesh sample was put in a test tube,
soaked with 15 ml of distilled water, heated in a water bath at 65 °C
for 30 min with occasional shaking, and then filtered hot through a dry
and pre-weighed 0.45 µm nylon membrane using a Millipore filter. The
residue was first washed with 20 × 2 ml of deionized water and then
sequentially with 20 × 1 ml each of ethanol, acetone, and diethyl ether.
The residue was then transferred to a pre-weighed aluminium pan in 
preparation for lignin content measurement. 

Lignin content measurement 

The procedure used was as described by Chang et al. (2008). Dried
80-mesh samples (5 ± 1 mg) of the ground and extracted Arabidopsis
stems were weighed to the nearest 0.01 mg, and then digested with
1.0 ml of 25% acetyl bromide in acetic acid in a 70 °C water bath for
30 min with shaking at 10 min intervals. After cooling at room temperature,
the samples were stored in an ice bath for 5 to 120 min, and
5.0 ml of acetic acid was added to each sample. After mixing, 30 µl aliquots
of the mixed content, in triplicate, were transferred to the wells
of a 96-well quartz microplate. Transfer of the samples to the microplate
was completed within 5 min before sequentially adding 40 µl of
1.5 M NaOH, 30 µl of 0.5 M hydroxylamine hydrochloride and 150 µl
of acetic acid to each well using a ten-channel multiple pipette. The
absorbance of the solutions in the wells at 280 nm was measured using
a Perkin-Elmer Wallac 1420 microplate reader. A blank was included to
correct for background absorbance by the reagents.

Data analysis 

Data analysis was carried out using the R statistical language (R 
Development Core Team, 2009). Broad sense heritability was estimated 
using the analysis of variance component in the lme4 package. Variance 
components for the lines and for the experiments were calculated with 
the model: 

lmer(TRAIT ~ (1|LINE) + (1|EXPERIMENT))

and used to calculate the heritability with the formula:

H² = Var(lines)/[Var(lines) + Var(Residual)]
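To make the arithmetic of the heritability formula concrete, the minimal Python sketch below applies it to a pair of variance components. It assumes the components have already been extracted from the fitted mixed model; the function name and the numerical values are hypothetical and are not results from this study. Note that, as in the formula above, the experiment variance does not enter the denominator.

```python
def broad_sense_heritability(var_lines, var_residual):
    """H^2 = Var(lines) / (Var(lines) + Var(Residual))."""
    if var_lines < 0 or var_residual < 0:
        raise ValueError("variance components must be non-negative")
    denominator = var_lines + var_residual
    if denominator == 0:
        raise ValueError("total variance is zero")
    return var_lines / denominator


# Hypothetical variance components, chosen only for illustration.
h2 = broad_sense_heritability(var_lines=0.0062, var_residual=0.0038)
print(f"H^2 = {h2:.1%}")
```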

Best linear unbiased predictor (BLUP) values for each line were also 
calculated with the lme4 package. Variance components for the lines 
and the experiments were calculated using the same model as for the 
heritability, and the random effects for each line were extracted with the 
ranef function. 

Composite interval mapping was performed with the R/qtl (Broman
et al., 2003) analysis package. The imputation method was selected,
with 256 draws, an error of 0.001, and 2.5 cM steps. The composite
interval mapping was done with the imputation method, with three
covariates and a 10 cM window:

cim(cross, method='imp', window=10, n.marcovar=3)

Calculations were performed on each experimental data set inde- 
pendently and also on the calculated BLUP values for fibre length and 
lignin content using the same method. The significance thresholds were
determined by performing 1000 permutations. The multiple QTL model 
was devised using the fitqtl function. Results obtained from the compos- 
ite interval mapping analysis were used to construct the QTLs and their 
positions were used in a simple additive model. The resulting model was 
used to assess the effect of each QTL on the trait (percentage of explana- 
tion of the observed total variance). 
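The logic of a permutation-derived significance threshold can be sketched as follows. This is a deliberately simplified illustration, not the procedure implemented in R/qtl: it uses single-marker regression on two genotype classes instead of composite interval mapping, and the marker matrix, trait values, effect size, and function names are all simulated or hypothetical. Phenotypes are shuffled relative to genotypes, the genome-wide maximum LOD is recorded for each shuffle, and the threshold is taken as the (1 - alpha) quantile of those maxima.

```python
import math
import random


def marker_lod(genotypes, phenotypes):
    """LOD for one marker: (n/2) * log10(RSS_null / RSS_marker)."""
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)
    if rss_null == 0.0:
        return 0.0  # no phenotypic variation, hence no evidence
    rss_marker = 0.0
    for allele in set(genotypes):
        group = [y for g, y in zip(genotypes, phenotypes) if g == allele]
        group_mean = sum(group) / len(group)
        rss_marker += sum((y - group_mean) ** 2 for y in group)
    if rss_marker == 0.0:
        return math.inf  # genotype classes explain the trait perfectly
    return (n / 2.0) * math.log10(rss_null / rss_marker)


def max_lod(marker_matrix, phenotypes):
    """Largest single-marker LOD across all markers."""
    return max(marker_lod(marker, phenotypes) for marker in marker_matrix)


def permutation_threshold(marker_matrix, phenotypes, n_perm=1000, alpha=0.05, seed=1):
    """(1 - alpha) quantile of the maximum LOD over shuffled phenotypes."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    null_maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null_maxima.append(max_lod(marker_matrix, shuffled))
    null_maxima.sort()
    index = min(n_perm - 1, math.ceil((1.0 - alpha) * n_perm) - 1)
    return null_maxima[index]


# Simulated data (hypothetical): 98 lines, 20 unlinked markers coded 0/1,
# with an additive effect placed at marker index 5.
rng = random.Random(0)
n_lines, n_markers = 98, 20
markers = [[rng.randint(0, 1) for _ in range(n_lines)] for _ in range(n_markers)]
trait = [0.8 + 0.15 * markers[5][i] + rng.gauss(0.0, 0.08) for i in range(n_lines)]

threshold = permutation_threshold(markers, trait, n_perm=200)
print(f"5% threshold = {threshold:.2f}; LOD at marker 5 = {marker_lod(markers[5], trait):.2f}")
```

Taking the maximum over all markers within each permutation is what makes the resulting threshold genome-wide rather than per-marker.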

Analysis of variance (ANOVA) was performed to test the association 
of the genotype of selected markers with the observed trait, using the 
gee and multcomp packages. 

Candidate gene selection 

The list of all protein-encoding genes included in the intervals delimited 
by FQ2 and FQ5 (see Results) was screened for genes with enriched 
expression in the second-internode data set of the Bio-Array Resources 
for Plant Biology's expression browser (http://bar.utoronto.ca/welcome.htm).
The selected cut-off was 1.5 on the log2 scale. Additional genes
were also retained for their known connection to fibre and cell-wall
development or auxin/gibberellin connection. The selected list of genes
was then screened for mismatches at the amino acid level between the
resequenced Col-0 genome (Cao et al., 2011) and the Ler-0 genome
(Gan et al., 2011), using the Blastx algorithm (Altschul et al., 1997).
Genes showing at least one change at the amino acid level were consid- 
ered as candidates. For the LQ2 interval (see Results), the same method 
was applied, but additional genes were limited to lignin metabolism/ 
synthesis. 
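The selection rule described above can be summarized as a simple filter. The Python sketch below is only a schematic restatement under stated assumptions: the record fields, the gene identifiers, and the choice of an inclusive (>=) comparison at the 1.5 cut-off are hypothetical, and the expression and polymorphism values would in practice come from the expression browser and the Blastx comparison, respectively.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    identifier: str          # placeholder label, not a real locus
    log2_enrichment: float   # second-internode expression enrichment (log2)
    known_role: bool         # curated link to the trait (e.g. cell wall, auxin/GA)
    aa_changes: int          # predicted amino acid differences, Col vs Ler


def select_candidates(genes, cutoff=1.5):
    """Keep genes that are stem-enriched or have a known role,
    and that carry at least one predicted amino acid change."""
    selected = []
    for gene in genes:
        shortlisted = gene.log2_enrichment >= cutoff or gene.known_role
        if shortlisted and gene.aa_changes >= 1:
            selected.append(gene.identifier)
    return selected


# Hypothetical records, for illustration only.
interval_genes = [
    Gene("geneA", 2.1, False, 3),   # enriched and polymorphic: kept
    Gene("geneB", 0.4, True, 1),    # known role and polymorphic: kept
    Gene("geneC", 2.5, False, 0),   # enriched but identical proteins: dropped
    Gene("geneD", 0.2, False, 4),   # polymorphic but not shortlisted: dropped
]
print(select_candidates(interval_genes))
```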



Results 

Fibre length measurements 

The Col-4xLer-0 RIL population created by Lister and Dean 
(1993) was chosen to produce the stem tissues used for the anal- 
ysis. To collect material for fibre length measurements, the RIL 
population was grown in three independent biological samples. 
A single, central inflorescence stem from each line was used in
the first experiment, referred to as Fibre1, whilst two stems from
two individuals per line were pooled together in the two subsequent
experiments, referred to as Fibre2 and Fibre3. Great care was
taken to minimize contamination from other cell types, typically 
leading to a very clean fibre preparation (data not shown). The 
FQA analysis yielded considerably higher fibre length values in 
the Col-4 parental line than in the Ler-0 parental line, as shown 
by a t-test (P <0.001), the mean value across the three experi- 
ments being 0.977 and 0.757 mm, respectively. 

The fibre length values determined across the entire set of RILs 
showed a wider range than the values observed for the paren- 
tal lines, which indicated the possibility of transgressive seg- 
regation (Fig. 1). The shortest measured fibres within the RILs 
were 0.621 mm long, 83.9% of the minimum Ler value, whilst
the longest were 1.287 mm, 122.5% of the longest Col value
(Table 1). Broad sense heritability for the trait was estimated to
be 62.3%. The histograms from the individual experiments for
Fibre1, Fibre2, and Fibre3 (Fig. 1A, B, and C, respectively)
showed a similar structure. In each, a main peak was associated 
with shorter fibres, as indicated by the median values in Table 1, 
with a trailing end towards the longer values. 

Lignin content measurements 

For lignin content measurements, the RIL population was grown
twice to harvest material for biological replicates, labelled Lignin1
and Lignin2. Measured differences in lignin content between
the parent lines were inconclusive; a 2% difference was seen
between the parents in Lignin1 but almost none in Lignin2
(Table 2). This was confirmed by a t-test, which did not show a
statistically significant difference between the parents (P >0.60).
The mean value of both experiments was a lignin content of
19.65% (w/w) (Table 2).

In contrast, the mean lignin content values in individual lines
of the RIL population ranged from 13.13 to 20.98%. Broad-sense
heritability was estimated at 32.3%, possibly indicating a relatively
limited genetic influence. The distribution of lignin content
values in the Lignin1 and Lignin2 datasets was very similar,
as shown in Fig. 2A and B.

QTL analysis for fibre length 

A genetic map based on the segregation of 676 single-feature
polymorphisms (SFPs) in the Col-4xLer-0 RIL population
(Singer et al., 2006) was used to perform a QTL
analysis using the fibre length values of 98 lines of the RIL
collection, first by interval mapping (data not shown) and
then by composite interval mapping (CIM) on each data set
independently, using R/qtl (Jansen, 1993; Zeng, 1993, 1994;
Jansen and Stam, 1994).

Fig. 1. Distribution of fibre length values (mm) in the three
experiments: Fibre1 (A), Fibre2 (B), and Fibre3 (C).

Two significant fibre length QTLs (P <0.05) were mapped to 
similar genome positions in all three data sets (Fibre1, Fibre2, and
Fibre3): based on the chromosome to which they each mapped, 
they are referred to hereafter as FQ2 and FQ5, respectively. 



Table 1. Fibre length measurements (mm) in the RILs and Col 
and Ler parents. 
Table 2. Lignin content measurements in the RILs and Col and
Ler parents. All values are given as the percentage of lignin based
on unextracted weight (w/w).

Experiment   Parents                    RIL
             Col      Ler      Mean     Mean     Median   Range          SD
Lignin1      17.74    19.73    18.73    17.09    17.15    13.13–19.39    1.39
Lignin2      20.40    20.75    20.57    18.07    18.09    13.80–20.98    1.64

FQ2 was located on the bottom arm of chromosome 2, with 
the maximum LOD score positioned between 49.5 and 55 cM 
(Fig. 3) in the map defined by Singer et al. (2006). The maxi- 
mum LOD score was observed in the Fibre2 data set at 9.09,
and LOD scores in Fibre1 and Fibre3 were also highly significant,
with LOD scores of 8.63 and 5.67, respectively (Fig. 3).
The extent of the QTL varied slightly between the data sets, but 
the overall confidence interval could be calculated based on a 2 
LOD score drop from the summit of the peak. This placed FQ2 
between 43.3 and 57 cM, an interval covering 21 SFP markers 
(Fig. 3, Table 3). 
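The 2 LOD drop rule used to delimit the interval can be illustrated with a short sketch. The Python function below is a simplified version that returns the outermost contiguous positions around the peak whose LOD is still within the chosen drop; it is not the R/qtl implementation, and the positions and LOD values are hypothetical numbers chosen for illustration, not the scores plotted in Fig. 3.

```python
def lod_drop_interval(positions_cm, lod_scores, drop=2.0):
    """Return (left, peak, right) positions of the LOD-drop support interval.

    The interval is the contiguous run of positions around the peak
    whose LOD score is at least (peak LOD - drop).
    """
    if not lod_scores or len(positions_cm) != len(lod_scores):
        raise ValueError("positions and LOD scores must be non-empty and aligned")
    peak_index = max(range(len(lod_scores)), key=lod_scores.__getitem__)
    cutoff = lod_scores[peak_index] - drop
    left = peak_index
    while left > 0 and lod_scores[left - 1] >= cutoff:
        left -= 1
    right = peak_index
    while right < len(lod_scores) - 1 and lod_scores[right + 1] >= cutoff:
        right += 1
    return positions_cm[left], positions_cm[peak_index], positions_cm[right]


# Hypothetical LOD profile on a 2.5 cM grid.
positions = [40.0, 42.5, 45.0, 47.5, 50.0, 52.5, 55.0, 57.5, 60.0]
lod = [1.0, 3.2, 6.9, 7.8, 8.4, 9.0, 8.1, 6.5, 2.0]
print(lod_drop_interval(positions, lod))
```

A more conservative variant would extend the interval to the first grid position on each side that falls below the cut-off.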

The second significant QTL, FQ5, was located in the distal 
part of the top arm of chromosome 5. The maximum LOD score 
was found between 3 and 8 cM, with a maximum value of 9.90 
in the Fibre1 experiment (Fig. 3). The estimated extent of FQ5
with a 2 LOD score drop was between 1.5 and 11.6 cM, which 
encompassed 17 SFP markers (Table 3). 

In order to assess the accuracy of the mapping, BLUP
values were calculated from the three experiments and used
to perform the CIM calculation, producing results very similar
to those obtained with each individual experiment, as shown
in Fig. 3.

Fig. 2. Distribution of the values for lignin content (%, w/w) in the
two experiments: Lignin1 (A) and Lignin2 (B).

Fig. 4 showed that the measured fibre length was strongly correlated
with the genotype of two markers (C2_081 and C5_008)
in all three experiments, a fact further confirmed by ANOVA
(Supplementary Table 1 at JXB online). These markers were
selected as proxies for FQ2 and FQ5, as they were both found
as part of the interval identified in all three Fibre experiments.
Plants having the Ler allele of FQ2 and the Col allele of FQ5
displayed the longest fibres (mean 0.947–1.086 mm), whilst the
opposite allele combination yielded the shortest fibres (mean
0.764–0.821 mm). Fibre lengths in individuals possessing purely
Col or Ler alleles at these sites fell in between these extremes.

Further analysis with the R/qtl package using a multiple 
QTL model (Haley and Knott, 1992; Sen and Churchill, 2001) 
was performed to estimate the contribution of each QTL to the 
observed variance, based on a model devised for each data set. 
In the three data sets, the contribution of FQ5 to the total vari- 
ance (23.96-29.96%) was consistently higher than that of FQ2 
(15.91-22.18%). Importantly, the estimated combined contribu- 
tion of both QTLs amounted to approximately one-half of the 
total variance (45.79-50.58% depending on the data set). 

QTL analysis for lignin content 

The genetic map by Singer et al. (2006) was also used to 
locate QTLs controlling lignin content. Interval mapping and 
CIM revealed a strong LOD score peak on chromosome 2 in 
both independent experiments, indicative of a significant QTL, 
henceforth referred to as LQ2 (Fig. 5). The margins, defined by a 2
LOD score drop, located the QTL between 35.3 and 47.9 cM, an
interval containing 23 SFP markers. As for the fibre experiment,
BLUP values were calculated from both lignin experiments and
used to perform the same CIM calculation. The results obtained
with the BLUPs were very similar to those seen with individual
experiments (Fig. 5). A multiple QTL model was again
used to determine the contribution of the significant QTL to
the observed variance (Table 4), and this revealed that between
24.80 and 26.51% of the measured variance in lignin content was
Table 3. Summary of the QTLs identified by composite interval
mapping. The positions and marker intervals are based on the map
of Singer et al. (2006).

QTL   Experiment   Chromosome   Marker interval   Position (cM)   LOD
Fibre length
FQ2   Fibre1       2            C2_079-C2_089     53.4            8.63
      Fibre2       2            C2_075-C2_091     52.4            9.09
      Fibre3       2            C2_071-C2_088     49.1            5.67
FQ5   Fibre1       5                                              10.07
      Fibre2       5            C5_002-C5_008     3.0             8.85
      Fibre3       5            C5_008-C5_018     8.0             7.27
Lignin content
LQ2   Lignin1      2            C2_067-C2_077     45.9            5.37
      Lignin2      2            C2_056-C2_072     40.8            5.84

attributable to LQ2 in the two independent experiments. Sorting
the individual lines based on a single marker (C2_067) showed
that individuals carrying the Ler allele of this marker had, on
average, less lignin (16.50–17.47%) than individuals carrying
the Col allele (17.92–19.00%) (Fig. 6), a fact also confirmed by
ANOVA (Supplementary Table 1B). The marker C2_067 was
selected as a proxy for the QTL LQ2, as it belonged to the interval
defined in both Lignin experiments.

Annotated genes within the QTL intervals 

As all the SFP markers used in the construction of the Singer
et al. (2006) genetic map correspond to positions in the fully
sequenced genome of Arabidopsis, all annotated genes within
the QTL intervals identified by CIM could readily be identified.
The QTL FQ2 was flanked by the markers C2_071 and C2_091,
which corresponded to the genes At2g28060 and At2g37050
according to the AGI version 10 of the Arabidopsis genome.
This interval is 3.6 Mb long and contains 1005 loci encoding proteins.
The second significant fibre length QTL FQ5, flanked by
C5_002 and C5_018, was located in a 2.7 Mb interval between
At5g01150 and At5g08580, which contains 789 protein-encoding
genes. Finally, LQ2, the QTL controlling lignin content, was
located between the markers C2_056 and C2_077, corresponding
to At2g23900 and At2g30240, a 1.12 Mb interval that contains
672 protein-encoding loci.

Although our genetic approach could link entirely uncharacterized
genes to fibre development, it is worthwhile mining
the identified intervals for plausible candidate genes, as the use
of increasing genomic resources could further narrow down the
selection and accelerate the cloning of the respective QTLs. On
the basis of currently available resources, we selected candidate
genes in the following way. First, genes with an association
with fibre and cell-wall development were compiled. This
list was supplemented with genes involved in auxin or gibberellin
metabolism and signalling, two hormones with roles in fibre
development (Zhong and Ye, 2001; Dayan et al., 2010; Ragni
et al., 2011). At the same time, the Bio-Array Resources for Plant
Biology's expression browser (Toufighi et al., 2005) was used
to identify genes whose expression was enriched in the stem
Fig. 4. Box plots of fibre length sorted by two markers each linked
to QTLs for the Fibre1 (A), Fibre2 (B), and Fibre3 (C) experiments.
The box plots were drawn after sorting the population according
to the genotype of the markers C2_081 and C5_008, showing
the difference in average fibre length (mm) of each combination.
The graphs clearly show that individuals with the combination Ler
C2_082 and Col C5_008 had the longest fibres, whilst individuals
with the combination Col C2_082 and Ler C5_008 had the
shortest fibres. The x-axis shows the genotype of the markers
C2_082 and C5_008 and the y-axis shows fibre length.

second-internode dataset. Finally, all the selected genes were 
screened for predicted amino acid polymorphisms between the 
Columbia and Landsberg erecta alleles. This selection resulted 
Fig. 5. Distribution over chromosome 2 of the LOD score for
lignin content. LOD was calculated independently by CIM for
the two lignin experiments using a 10 cM window and three
covariates. The LOD score for Lignin1 is in green, Lignin2 in pink,
and the BLUP in khaki. The red line shows the most conservative
significance threshold (P <0.05), calculated for Lignin1
as 4.39. Under the LOD score curve are indicated the intervals
obtained from Lignin1 and Lignin2 in their respective colours.

Table 4. Summary of the QTL in the multiple QTL mapping
model. QTLs are identified by their location. The % variance is the
percentage of the total variance explained by the QTL. The LOD
score given is for the entire model.

QTL   Experiment   % Variance   P value     LOD score
Fibre length
FQ2   Fibre1       22.18        3.33e-??
FQ5   Fibre1       28.40        1.03e-09    13.76
FQ2   Fibre2       21.83        3.39e-09
FQ5   Fibre2       23.96        8.03e-10    15.46
FQ2   Fibre3       15.91        6.87e-??
FQ5   Fibre3       29.96        4.80e-09    11.83
Lignin content
LQ2   Lignin1      26.51        7.45e-07    5.49
LQ2   Lignin2      24.80        4.71e-06    4.70


in 51 genes for FQ2 and 27 for FQ5. A complete list is provided 
in Supplementary Table 2 (at JXB online). 

Some of the selected genes were highly conspicuous. For example,
a gene involved in the metabolism or deposition of cell-wall
polymers other than cellulose, FRAGILE FIBER 8/IRREGULAR
XYLEM 7 (FRA8/IRX7), encoding a glycosyltransferase involved
in the synthesis of xylan, could affect fibre length (Brown et al.,
2007). Remarkably, the fra8 mutant also displays thinner secondary
cell walls in fibres (Zhong et al., 2005). Among the genes
found in the FQ2 interval, there were several members of the
CELLULOSE SYNTHASE-LIKE (CSL) family: CSLA7, CSLB1,
-B2, -B3, and -B4, and CSLD1. CSLD1 is a gene whose activity
is important for pollen-tube cell-wall development and growth
(Bernal et al., 2008), while CSLA7 encodes a protein reportedly
involved in the production of β-mannans (Liepman et al., 2005).
The function of β-mannans within the cell wall remains unknown,
Fig. 6. Box plots of the lignin content values sorted according to a
QTL-linked marker for the Lignin1 (A) and Lignin2 (B) experiments.
The box plots were drawn after sorting the population according
to the genotype of the marker C2_067, showing the difference in
average lignin content. The graphs clearly show that individuals
with the Ler marker had the lowest lignin content.

but CSLA7 activity is crucial for proper embryo development. 
A csla7 null allele causes arrest during embryogenesis, disrupting
normal cell division and patterning of the embryo, precluding
further analysis of the protein function at later stages (Goubet
et al., 2003). However, the precise functions of the CSLB genes
have not been investigated. Also, genes involved directly in cellulose
synthesis, deposition, or related processes, whose knockout
mutant phenotypes may affect many tissues, may quantitatively
influence fibre properties through their allelic variation, as found in
FQ2 and FQ5. The TRICHOME BIREFRINGENCE (TBR) gene
is located inside FQ5. TBR is involved in synthesis of the secondary
cell wall, as is its homolog TBR-LIKE 3 (TBL3), which
is also present in FQ5. Mutant tbr and tbl3 plants display reductions
in crystalline cellulose levels but not in total cellulose content
(Bischoff et al., 2010). Furthermore, more members of the
TBL gene family were found in FQ5 (TBL35) and FQ2 (TBL43, 
45). In addition, lignins play major roles in determining the 
mechanical properties of the cell wall, and mutants affected in 
lignin biosynthesis or deposition may display fibre abnormali- 
ties. Among the candidate genes identified were four laccase- 
encoding genes: one in FQ2, LAC2, and three on FQ5, LAC10, 
-11, and -12. Laccases are ubiquitous copper oxidases, present in
fungi and plants (Messerschmidt and Huber, 1990) and have been
used for their lignin-degrading activity in fibre pulping (Mayer
and Staples, 2002). In Arabidopsis, the laccase gene family comprises
17 members, of which only two have been subjected to
detailed study: LAC4 and -17. Mutations in these genes result
in reduced general lignification, a phenotype particularly pronounced
in the interfascicular fibres in the lac17 mutant (Berthet
et al., 2011). Another gene involved in lignin metabolism in FQ2
is the cinnamate 4-hydroxylase (C4H) At2g30490. A mutation in
this gene, reduced epidermal fluorescence 3 (ref3), results in collapsed
stem vasculature and reduced lignin content (Schilmiller
et al., 2009).

A link between fibre defects and auxin transport defects is
plausible in the light of cell-dimension changes in auxin-transport-inhibited
plants (Mattsson et al., 1999) and has recently been
supported by the walls are thin 1 (wat1) mutant in Arabidopsis
(Ranocha et al., 2010), as well as by a mutation in the LIKE
AUXIN RESISTANT 1 (LAX1) gene, which is located in FQ5 and
encodes an auxin influx carrier of the AUX1 family (for review,
see Kramer, 2004). Moreover, the ATP BINDING CASSETTE B1
gene encoding an ABC transporter, also part of an auxin transport
system, is located in the FQ2 interval (Wu et al., 2010).
Gibberellins are also known to influence cell dimensions and
Dayan et al. (2010) demonstrated a role of gibberellins in the
development of fibres through modulation of GA 2-oxidase and
GA 20-oxidase activity. These findings prioritize GA 2-oxidase
in FQ2 (GA2OX3) and GA 20-oxidase in FQ5 (GA20OX3) as
candidate genes for further studies.

Other genes deserve close attention because of their related
mutant phenotypes. SCAR/WAVE 1 encodes an activator of
the ARP2/3 complex responsible for the nucleation of the actin
filaments and, together with the other SCAR genes, is necessary
for normal stem growth (Zhang et al., 2008). BELLRINGER/
PENNYWISE (BLR/PNY) encodes a BELL1-like transcription
factor directly involved in the patterning of the inflorescence
stem. A blr/pny mutant shows defects in shoot development
and displays an altered stem anatomy (Byrne et al., 2003;
Smith and Hake, 2003). BLR/PNY can control internode cell
elongation through modulation of the activity of the pectin
methylesterase PME5 (Peaucelle et al., 2011). Interestingly,
At2g36700 and At2g367210 encode a pair of pectin methylesterases
and At5g04970 encodes a pectin methylesterase inhibitor,
although no particular phenotypic functions are known for
them. FOLYLPOLYGLUTAMATE SYNTHETASE 1 (FPGS1) is
a plastidial enzyme belonging to a small gene family. Whilst the
fpgs1 mutant has no discernible phenotype, the double mutant
of fpgs1 and fpgs3 is dwarfed, indicating a role in stem growth
(Mehrshahi et al., 2010).

Finally, another group of genes located in the FQ5 interval 
might also affect fibre properties through their roles in cell pro- 
liferation or cell division. One of these is the CELL-DIVISION 
CYCLE 48C (CDC48C) gene, one of three CDC48 genes in
Arabidopsis. The roles of CDC48 genes in cell division and
cytokinesis have been characterized in detail (Rancour et al.,
2002; Park et al., 2008). Furthermore, the SIAMESE (SIM) protein
has a cyclin-binding motif and interacts with several cell-cycle
regulators (Kasili et al., 2010; Van Leene et al., 2010). SIM
controls endoreduplication in trichomes and other organs
(Walker et al., 2000).

Eighteen genes from the LQ2 interval were selected as candidate
genes because of their suspected or demonstrated impact on
lignin biosynthesis or because of their enriched expression in the
second internode of the stem in association with predicted amino acid
polymorphisms between Columbia and Landsberg erecta. Among
these, the CCR6 gene, At2g23910, encodes a putative cinnamoyl
CoA reductase, a member of a family of oxidoreductases that
includes enzymes responsible for the final step in monolignol
biosynthesis (for review, see Boerjan et al., 2003). Studies in
tobacco and eucalyptus have linked CCRs and lignin biosynthesis
(Lacombe et al., 1997; Ralph et al., 1998). Although the precise
function of CCR6 is unknown, it is most strongly expressed
in immature seeds, and microarray surveys have shown it to
be co-expressed with flavonol biosynthesis genes
(Yonekura-Sakakibara et al., 2008). Among the 11 CCR genes annotated
in Arabidopsis, only CCR1 (At1g15950) has been characterized
extensively. A ccr1 mutant seems to suffer from reduced lignin
content in the stem (Jones et al., 2001), although more recent
work suggests that this phenotype may be the result of delays in
lignin deposition rather than an inability to execute the reaction
(Patten et al., 2005; Laskar et al., 2006).



Discussion 

Whilst QTLs affecting fibre properties have been mapped in 
several plant species, including maize (Cardinal et al., 2003;
Krakowsky et al., 2005, 2006), sorghum (Shiringani and Friedt,
2011), and, most notably, cotton (Mei et al., 2004; Chee et al.,
2005a,b; Draye et al., 2005; Ulloa et al., 2005; Qin et al., 2008;
Chen et al., 2009; Paterson et al., 2011; Zhang et al., 2011),
application of similar approaches to trees has been hindered by 
the length of time required for the tree to mature, combined with 
the difficulty in predicting final wood properties in mature trees 
based on the properties of juvenile wood. 

Although QTL mapping in crop plants and trees directly 
benefits breeding efforts, molecular access to relevant genes 
can be achieved much faster in genetic model plants, such as 
Arabidopsis, where biological properties and genomic resources 
enable unmatched genetic resolution and molecular access to 
underlying genes, although this very simplicity might mask
complex epistatic interactions found in larger crop genomes.
Furthermore, Arabidopsis has also demonstrated
its suitability for providing molecular access to conserved
plant biological mechanisms (Liepman et al., 2010; Wienkoop
et al., 2010; Zhang et al., 2011), and, as an arboreal growth
habit is not a monophyletic trait but has arisen independently
many times in both angiosperm and gymnosperm lineages
(Bowe et al., 2000; Chaw et al., 2000; Soltis et al., 2002),
our understanding of the molecular mechanisms operating in the 
cells of trees is expected to benefit from genetic discoveries in 
Arabidopsis as much as in other plants. 

In this study, we explored the possibility of identifying QTLs 
relevant for important wood traits in an established RIL population
of Arabidopsis as a first step towards a molecular understanding
of the underlying cellular mechanisms. Our reproducible
mapping of two QTLs controlling fibre length in Arabidopsis
accessions has demonstrated the feasibility of the approach,
as well as of the high-throughput phenotyping technology that 
was employed in the study. Rapidly developing mass-sequenc- 
ing and positional cloning technologies are well suited to move 
on towards the identification of the molecular alleles underly- 
ing these two QTLs in the future. At the same time, Arabidopsis 
QTL analysis can be expanded towards accessions with more 
strongly divergent fibre traits. 

Our results for genetic control of lignin content seemed to 
indicate a limited genetic influence, but it was still sufficient to 
successfully perform QTL mapping. The results of CIM analysis
were similar in the Lignin1 and Lignin2 experiments (Figs
4 and 6), which together identified the first QTL affecting lignin
content in the Arabidopsis stem. As with fibre properties, studies
on genetic control of lignin content have been conducted in other
species, such as pine (Sewell et al., 2002), maize (Cardinal et al.,
2003; Krakowsky et al., 2005, 2006), barley (Grando et al., 2005)
and rice (Xie et al., 2011), research that might be complemented
with molecular access to the relevant genes in Arabidopsis. As
part of our mapping efforts, we have incorporated a novel, microscale,
acetyl bromide-based method for determination of lignin
content (Chang et al., 2008) developed for assaying large numbers
of Arabidopsis plants. The results reported here encourage
its application for screening a wider range of Arabidopsis acces- 
sions to identify extreme divergences of lignin content in this spe- 
cies. Genetic analysis of such genotypes may in turn lead to the 
identification of more numerous and stronger QTLs for this trait. 

Previous studies have identified a number of factors influencing
fibre properties, including hormonal regulation (Zhong
and Ye, 2001; Dayan et al., 2010; Ragni et al., 2011), activation
of a number of key transcription factors (Ko et al., 2007;
Yamaguchi et al., 2011) and cytoskeletal functions (Burk and Ye,
2002). Mutations in any of these pathways could affect
fibre length, which may give the spurious impression that, against the
background of all these known influences, no new determinants
of fibre length remain to be found. This is not an uncommon situation
for new genetic searches. Such searches, however, regularly find
new influences and their underlying molecular bases, which can
then be linked to the network of previously known mechanisms.

Whether the QTLs affect genes with a known relationship to 
fibre properties or entirely new genes, we can exclude that there 
is any obvious hormonal, cytoskeletal, or other known influ- 
ence overtly segregating in the Col-4xLer-0 RIL population 
that would just impinge on fibre properties. Furthermore, we 
explored the fidelity of FQA measurements by comparing them 
with microscopically measured fibre lengths in a large number of 
Arabidopsis accessions (R.P. Chandra, H.X. Chang, G. Soong,
A. Capron, T. Berleth, and R.P. Beatson, unpublished data). We
also did not observe any correlation with plant height or stem 
elongation, either in this study or in a systematic survey of 150 
Arabidopsis accessions (R.P. Chandra, H.X. Chang, G. Soong,
A. Capron, T. Berleth, and R.P. Beatson, unpublished data). This
is supported by the fact that there is no straightforward correla- 
tion between plant height and cell size. For example, mutants 
defective in fibre development are not necessarily reduced in 
height (Zhong et al., 1997). Furthermore, although the erecta
mutation reduces plant height, cortex cells in the mutant have
been described as longer (Torii et al., 1996). In conclusion,
the absence of any recognizable correlation between fibre length 
and any other trait in the segregating population underscores the 
need to find relevant genes underlying the QTLs FQ2 and FQ5 
to gain a more comprehensive understanding of fibre length 
regulation. 

It is noteworthy that, for both the fibre length and lignin con- 
tent traits, the variation within the population vastly exceeded 
the variation between the parental genotypes, a phenomenon 
that has been observed in other similar studies involving RIL 
populations (Reymond et al., 2006; Coluccio et al., 2011;
Sanyal and Randal Linder, 2011). Multiple QTLs may mask 
each other's effects within the genome of a parental line but then 
become fully recognizable in a segregating population. Thus, 
it is worthwhile searching for QTLs in established RILs first, 
which provides the advantages of a high density of DNA mark- 
ers and thorough characterization of the parental genotypes. In 
the case of Col and Ler, the two most widely used Arabidopsis 
accessions, their fully sequenced genomes and the availability 
of bacterial artificial chromosome (BAC) libraries are invalu- 
able aids for identifying the molecular basis of each QTL. The 
ability of one of these BACs (or an overlap of several BACs) to 
confer altered fibre length properties to another accession would 
be the entry point for the molecular genetic dissection of fibre 
length control. 
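
The masking argument above can be illustrated with a minimal sketch. It assumes two hypothetical, purely additive loci at which each parent carries the trait-increasing allele at one locus and the trait-decreasing allele at the other. The locus names, effect sizes, and baseline are illustrative values only and are not estimates from this study.

```python
from itertools import product

# Hypothetical additive effects (arbitrary units) of the trait-increasing
# allele at two loci; these are illustrative values, not results of this study.
EFFECTS = {"locus_a": 0.10, "locus_b": 0.08}
BASELINE = 1.00  # hypothetical trait value with no increasing alleles

# Each hypothetical parent carries the increasing allele (1) at one locus
# and the decreasing allele (0) at the other, so the effects largely cancel.
PARENTS = {
    "parent_1": {"locus_a": 1, "locus_b": 0},
    "parent_2": {"locus_a": 0, "locus_b": 1},
}


def trait_value(genotype):
    """Trait value of a homozygous line under a purely additive model."""
    return BASELINE + sum(EFFECTS[locus] * allele
                          for locus, allele in genotype.items())


def homozygous_ril_genotypes():
    """All allele combinations that recombinant inbred lines can fix."""
    loci = sorted(EFFECTS)
    return [dict(zip(loci, alleles))
            for alleles in product((0, 1), repeat=len(loci))]


parent_values = [trait_value(g) for g in PARENTS.values()]
ril_values = [trait_value(g) for g in homozygous_ril_genotypes()]

# Range between the two parents versus range across the segregating lines.
parent_range = max(parent_values) - min(parent_values)
ril_range = max(ril_values) - min(ril_values)

print(f"range between parents: {parent_range:.2f}")
print(f"range among RILs:      {ril_range:.2f}")
```

Under these assumptions, recombinant lines that fix both increasing or both decreasing alleles fall outside the parental values, so the range across the population exceeds the range between the parents even though no new alleles are involved.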

Association genomics studies are presently underway in 
woody species such as Populus and Eucalyptus, whose genomes 
have been fully sequenced. These analyses seek to correlate sin- 
gle-nucleotide polymorphism allele patterns with variance in a 
wide range of wood properties, including fibre length and lignin 
content (reviewed by Nieminen et al., 2012; Mizrachi et al.,
2012). As gene sequence and overall synteny are largely conserved
between Arabidopsis and Populus (Tuskan et al., 2006),
for example, it will be interesting to compare the results of such 
genome-wide analyses with the gene content of the Arabidopsis 
QTL intervals detected here. 

Supplementary data 

Supplementary data are available at JXB online. 

Supplementary Table S1. ANOVA testing for correlation
between traits and selected markers.

Supplementary Table S2. List of candidate genes for the three
mapped QTLs.